Visual analytics for BigData variety and its behaviours
Authors
Abstract
BigData, defined as structured and unstructured data containing images, videos, text, audio and other forms of data collected from multiple datasets, is too big, too complex and moves too fast to analyze using traditional methods. This has given rise to several issues that must be addressed: 1) how to analyze BigData across multiple datasets, 2) how to classify the different data forms, 3) how to identify BigData patterns based on its behaviours, and 4) how to visualize BigData attributes in order to gain a better understanding of the data. It is therefore necessary to establish a new framework for BigData analysis and visualization. In this paper, we extend our previous work on classifying BigData attributes into the "5Ws" dimensions based on different data behaviours. Our approach not only classifies BigData attributes for different data forms across multiple datasets, but also establishes the "5Ws" densities to represent the characteristics of data flow patterns. We use additional non-dimensional parallel axes in parallel coordinates to display the "5Ws" sending and receiving densities, which provide more analytic features for BigData analysis. The experiment shows that our approach with parallel coordinate visualization can be efficiently used for BigData analysis and visualization.
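As a rough illustration of the density idea in the abstract, the Python sketch below computes per-value densities on two dimensions and appends them as extra axes to each record's parallel-coordinates polyline. Everything specific here is an assumption made for illustration and is not taken from the paper: the dimension names in `DIMENSIONS`, the sample `records`, the choice of `where` as the sending side and `who` as the receiving side, and the definition of density as the fraction of records sharing a value.

```python
from collections import Counter

# Assumed dimension names for illustration; the abstract does not list them.
DIMENSIONS = ("when", "where", "what", "how", "why", "who")

# Hypothetical sample records, one value per assumed dimension.
records = [
    {"when": "2015-01", "where": "Twitter", "what": "text",
     "how": "mobile", "why": "share", "who": "public"},
    {"when": "2015-01", "where": "Twitter", "what": "image",
     "how": "web", "why": "share", "who": "public"},
    {"when": "2015-02", "where": "Facebook", "what": "video",
     "how": "mobile", "why": "comment", "who": "friends"},
    {"when": "2015-02", "where": "YouTube", "what": "video",
     "how": "web", "why": "share", "who": "public"},
]


def densities(rows, dimension):
    """Return the fraction of rows that take each value on one dimension."""
    counts = Counter(row[dimension] for row in rows)
    total = len(rows)
    return {value: n / total for value, n in counts.items()}


# Assumption: "where" identifies the sender and "who" the receiver.
sending = densities(records, "where")
receiving = densities(records, "who")


def to_polyline(row):
    """Build one polyline: the categorical axes plus two density axes."""
    axes = [row[d] for d in DIMENSIONS]
    return axes + [sending[row["where"]], receiving[row["who"]]]


polylines = [to_polyline(row) for row in records]
```

Each entry of `polylines` could then be passed to a plotting library as one line across the categorical axes and the two additional density axes.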
Similar resources
Quality-aware aggregation & predictive analytics at the edge
We investigate the quality of aggregation and predictive analytics in edge computing environments. Edge analytics require pushing processing and inference to the edge of a network of sensing & actuator nodes, which enables a huge amount of contextual data to be processed in real time that would be prohibitively complex and costly to transfer to centralized locations. We propose a quality-aware, t...
Mix 'n' match multi-engine analytics
Current platforms fail to efficiently cope with the data and task heterogeneity of modern analytics workflows due to their adherence to a single data and/or compute model. As a remedy, we present IReS, the Intelligent Resource Scheduler for complex analytics workflows executed over multi-engine environments. IReS is able to optimize a workflow with respect to a user-defined policy relying on cos...
Conquering Big Data with Spark
Today, big and small organizations alike collect huge amounts of data, and they do so with one goal in mind: extract "value" through sophisticated exploratory analysis, and use it as the basis to make decisions as varied as personalized treatment and ad targeting. To address this challenge, we have developed Berkeley Data Analytics Stack (BDAS), an open source data analytics stack for big data ...
A Study of Data Management Technology for Handling Big Data
The amount of data is increasing daily. Data requires storage and effective processing for information retrieval. Both are challenging in the case of BigData due to its velocity, variety and volume. It requires different management and efficient information retrieval schemes. There are different techniques available for the management of BigData. The distribution of the storage and the pro...
Making Online BigData Small: Reducing Computation Cost and Latency in Web Analytics through Sampling
In the era of big data, the volume, velocity and variety of data are exploding at an unprecedented pace. With the explosion of data on the web, internet companies are working towards building powerful data analytics systems that can crunch the big user data available to them and offer rich business insights. However, generating analytical reports and insights from high dimensional online user da...
Journal title:
- Comput. Sci. Inf. Syst.
Volume 12, Issue
Pages -
Publication date 2015